Abstract. In this paper we define and study a generalized version of the facility location problem in
which each facility has an associated cost function that depends on the number of clients assigned to it. We
focus on the case of concave facility cost functions, and present greedy 1.94- and 1.52-approximation
algorithms for this case. We also consider various generalizations and variants of the problem and
give an O(log n)-approximation algorithm for the non-metric generalized facility location problem.

1 Introduction 

The facility location problem is a central problem in operations research. In this problem, we have a set of
clients and a set of facilities, and we want to connect each client to a facility in a way that minimizes the
total cost. There are two types of costs associated with a solution: the connection costs and the facility
costs. The connection cost between each client j and facility i is a given number $c_{ij}$, which we have
to pay if we want to connect client j to facility i. Each facility i also has a facility cost $f_i$, which
is the amount we have to pay if we decide to connect at least one client to it. The cost of a solution is the
sum of the connection costs and the facility costs associated with the solution. We will generally
assume that the connection costs obey the metric inequality.

The facility location problem is well-studied in the field of approximation algorithms, and a number of
different approximation algorithms have been proposed for this problem using a variety of techniques
[1-11, 13, 14].

A natural generalization of the facility location problem is to allow the facility cost of a facility to be an 
arbitrary concave function of the number of cities connected to it. This question was first asked by Tom 
Leighton and was motivated by its applications in placing servers on the Internet. The reason it is assumed 
that the facility cost function is concave is the following principle of economics: As the number of clients 
increases, the cost per client will decrease, since they share some common expenses. 

The main result of this paper is a 1.94 approximation algorithm for the concave facility location problem. 
Our algorithm is a natural generalization of an algorithm by Jain et al. [7]. The analysis uses the techniques 
of dual-fitting and factor-revealing LPs introduced in [6]. We will also consider more general versions of 
the problem by relaxing the conditions of concavity for facility costs and metricity for connection costs. 

This paper is organized as follows. In section 2, we give a formal definition of the problem, and observe 
that it can be reduced to the capacitated facility location problem with hard capacities. In section 3, we 
present a greedy 1.94 approximation algorithm for the concave facility location problem. In section 4, we 
prove the approximation factor 1.94 for our greedy algorithm. In sections 5 and 6, we study generalizations 
and variants of the problem such as allowing the facility cost function to be slightly non-concave, facility 
location with convex cost functions, and relaxing the metric inequality for the connection costs. 

2 The problem 

The generalized facility location problem is defined as follows. We are given a set $\mathcal{F}$ of facilities
(a.k.a. servers); a set $\mathcal{C}$ of clients (a.k.a. cities or demands); a facility cost function
$f_i : \mathbb{N} \to \mathbb{N}$ for every facility $i \in \mathcal{F}$, which specifies the cost of the facility
as a function of the number of clients served by it; and finally connection costs $c_{ij}$ between facility i and
client j. We generally assume that the connection costs are metric (i.e., symmetric and satisfying the triangle
inequality), unless otherwise stated. The objective of the problem is to find an assignment $\varphi$ of all clients
to facilities (i.e., $\varphi : \mathcal{C} \to \mathcal{F}$) with minimum total cost. The total cost is the sum of
the connection costs ($\sum_{j \in \mathcal{C}} c_{\varphi(j),j}$) and the facility costs
($\sum_{i \in \mathcal{F}} f_i(|\{j \in \mathcal{C} : \varphi(j) = i\}|)$).
The focus of this paper is on the concave facility location problem, i.e., the generalized facility location
problem with a guarantee that all facility cost functions are concave. Recall that a function
$f : \mathbb{N} \to \mathbb{N}$ is called concave if and only if for each $x \ge 1$,
$f(x + 1) - f(x) \le f(x) - f(x - 1)$.
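
As a concrete reading of the objective above, the following minimal Python sketch evaluates the total cost of an assignment and checks the concavity condition. The names `is_concave` and `total_cost`, the dictionary layout of `conn_cost`, the sample instance, and the convention that a facility serving no client is not charged are assumptions made for illustration; they are not part of the paper.

```python
from collections import Counter

def is_concave(f, n):
    """Return True if f(x+1) - f(x) <= f(x) - f(x-1) for every 1 <= x < n."""
    return all(f(x + 1) - f(x) <= f(x) - f(x - 1) for x in range(1, n))

def total_cost(assignment, conn_cost, facility_cost):
    """Cost of an assignment {client: facility}.
    conn_cost[i][j] is c_ij; facility_cost[i] is the function f_i.
    Assumption: facilities that serve no client are not charged."""
    load = Counter(assignment.values())
    connection = sum(conn_cost[i][j] for j, i in assignment.items())
    opening = sum(facility_cost[i](m) for i, m in load.items())
    return connection + opening

# Hypothetical instance: two facilities (0 and 1), three clients (a, b, c).
conn_cost = {0: {"a": 1, "b": 2, "c": 5}, 1: {"a": 4, "b": 3, "c": 1}}
facility_cost = {0: lambda m: 3 + m if m else 0, 1: lambda m: 2 if m else 0}
assignment = {"a": 0, "b": 0, "c": 1}
cost = total_cost(assignment, conn_cost, facility_cost)
```

Here facility 0 serves two clients and facility 1 serves one, so the facility costs are $f_0(2)$ and $f_1(1)$, added to the three connection costs.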

We observe that the concave facility location problem can be reduced to the capacitated facility location
problem with hard capacities. In the capacitated facility location problem, each facility i also has a capacity
$u_i$, which is the maximum number of clients that can be served by this facility. The problem has two variants:
in the first one, the capacities are hard in that facility i can be opened at most once to serve at most $u_i$
demands; in the second variant the capacities are soft; that is, facility i may be opened k times to serve up
to $k u_i$ demands at a cost of $k f_i$. It is easy to observe that both of these problems are special cases of the
generalized facility location problem. The capacitated facility location problem with soft capacities can be
reduced to the uncapacitated facility location problem [11], and the best known approximation algorithm
for this problem [11] is obtained via this reduction. However, hard capacities are known to be problematic
for techniques that rely on linear programs (such as LP-rounding and primal-dual algorithms). The main
problem is that the natural linear program has a large integrality gap for the case of hard capacities. Recently,
Pal et al. [12] used local search techniques to give the first constant factor approximation algorithm for
this problem, achieving a factor of $9 + \epsilon$. However, the running time of this algorithm is prohibitive for
applications with large data sets.

Proposition 1. Assume there is a polynomial time $\alpha$-approximation algorithm for the capacitated facility
location problem with hard capacities. Then the concave facility location problem can be approximated within
a factor of $\alpha$ in polynomial time.

Proof. For each facility $i \in \mathcal{F}$, we place n copies of this facility, where the j-th copy has opening cost
$f_i(j)$ and capacity j. We use the $\alpha$-approximation algorithm for the capacitated facility location problem
to solve this instance. Since the facility cost function is concave, we can assume without loss of generality
that at most one of these n copies is opened in this solution. Therefore, the solution to the capacitated
facility location instance can be easily transformed to a solution for the concave facility location problem
with the same total cost. □
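
The construction in the proof can be sketched in a few lines of Python. The function names and the sample cost function are hypothetical, and the sketch additionally assumes $f_i(0) = 0$; under that assumption a concave $f_i$ satisfies $f_i(a + b) \le f_i(a) + f_i(b)$, which is the reason a single larger copy never costs more than two smaller ones.

```python
def hard_capacitated_copies(facility_cost, n):
    """Return (facility, capacity, opening_cost) triples: copy j of facility i
    has capacity j and opening cost f_i(j), for j = 1..n."""
    return [(i, j, f(j)) for i, f in facility_cost.items() for j in range(1, n + 1)]

def is_subadditive(f, n):
    """Check f(a + b) <= f(a) + f(b) for all a, b >= 1 with a + b <= n."""
    return all(f(a + b) <= f(a) + f(b)
               for a in range(1, n) for b in range(1, n - a + 1))

f = lambda m: 3 + m if m else 0   # hypothetical concave cost with f(0) = 0
copies = hard_capacitated_copies({"p": f}, 3)
```

An instance with O(n) facilities and n clients therefore yields O(n^2) capacitated copies.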

The above proposition together with the algorithm of Pal et al. [12] gives a $(9 + \epsilon)$-approximation
algorithm for the concave facility location problem. In the next section, we will show a greedy algorithm for this
problem achieving a factor of 1.94.

3 The Greedy Algorithm 

In this section, we present the main result of this paper, a greedy 1.94-approximation algorithm
for the concave facility location problem. The approximation factor of this algorithm is much better than
that of the algorithm obtained from Proposition 1. In addition, since the previous algorithm uses the
approximation algorithm for the capacitated facility location problem with hard capacities, and the current
algorithm for that problem uses the local search method, its running time is very high, especially because
we reduce an instance of concave facility location with O(n) facilities to an instance of capacitated facility
location with $O(n^2)$ facilities. Our method is an extension of the method used by Jain et al. [7] and
Mahdian et al. [10]. First, we define stars.



Definition 1. A star consists of one facility and several clients. For a star S consisting of clients
$a_1, a_2, \dots, a_k$ and facility p, the cost of S, denoted by c(S) or $c_S$, is defined as
$\sum_{i=1}^{k} c_{p,a_i} + f_p(k)$, i.e., the sum of the connection costs of the clients to the facility p
plus the facility cost function of p for k clients.



Suppose an algorithm finds a solution of cost $\theta$ to the concave facility location problem, and also finds
values $\alpha_j$ for every client $j \in \mathcal{C}$ as the contribution of client j to the total cost. In addition,
assume $\sum_{j \in \mathcal{C}} \alpha_j = \theta$ and there is a constant $\gamma \ge 1$ such that for every star S,
$\sum_{j \in S \cap \mathcal{C}} \alpha_j \le \gamma\, c_S$. We now consider
an optimal solution OPT to the problem. Let i be a facility that is opened in OPT. For the set D of clients that
are connected to facility i in OPT, we can write
$\sum_{j \in D} \alpha_j \le \gamma\,(f_i(|D|) + \sum_{j \in D} c_{ij})$, or $\sum_{j \in D} \alpha_j \le \gamma\, c_S$
where S is the star consisting of i and D. By summing up the inequalities for every star that is picked in OPT, we
obtain $\theta = \sum_{j \in \mathcal{C}} \alpha_j \le \gamma \sum_{S \in OPT} c_S = \gamma \cdot cost(OPT)$.
Therefore, if we can find such an algorithm with a constant $\gamma$, then $\gamma$ is the approximation factor
of the algorithm.

It is worth mentioning that this approach can also be viewed through LP-duality. The problem can be
formalized by an integer linear program based on stars, and the $\alpha_j$'s are the variables of the dual program
in which we relax the inequalities by a constant factor $\gamma$. The reader is referred to Jain et al. [7] for more
details on this method, called the dual-fitting method.

In the next section, using the approach discussed above, we show that the simple greedy Algorithm A presented
below is a 1.94-approximation algorithm for the concave facility location problem. In this algorithm,
we use a notion of time (lines 1 and 6), such that every event can be associated with the time at which it
happened. Also, each client j has a budget from which it can offer some money to facilities: if j is unconnected
and its budget is more than the cost of the connection to a facility i, it offers the extra budget to
i (line 9); and if j is connected to a facility i', it offers to a facility i the amount that it can save by
switching its facility from i' to i (line 10). We note that at any time, the budget of each connected client
is equal to its current connection cost plus its total contribution toward open facilities.

Algorithm A: greedy algorithm for concave facility location

Input:  Metric connection costs $c_{ij}$ for each facility i and client j.
        Concave facility cost functions $f_i : \mathbb{N} \to \mathbb{N}$ for each facility i.
Output: For each client j, a facility p(j) to which j is assigned.
        For each client j, the contribution $\alpha_j$ of client j to the total cost.
begin
 1  let t = 0
 2  for each facility i let level_i = 0
 3  for each client j
 4      let p(j) = null
 5      let budget(j) = 0
 6  while there is an unconnected client increase time t
 7      for each unconnected client j let budget(j) = t
 8      for each client j and each facility i
 9          if p(j) = null let offer(j, i) = max(budget(j) - $c_{ij}$, 0)
10          else let offer(j, i) = max($c_{p(j),j}$ - $c_{ij}$, 0)
11      while there is a facility i and k - level_i clients $a_1, \dots, a_{k - level_i}$ (k > level_i)
            which contain at least one unconnected client and
            $\sum_{l=1}^{k - level_i}$ offer($a_l$, i) = $f_i(k) - f_i(level_i)$
12          for each 1 ≤ l ≤ k - level_i let p($a_l$) = i
13          let level_i = k
14  for each client j
15      let $\alpha_j$ = budget(j), the time that j first gets connected
end
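
The pseudocode above can be turned into a small runnable sketch. This is only an illustration under stated assumptions, not the authors' implementation: costs are assumed to be integers with $f_i(0) = 0$, time is advanced on an assumed grid of step 1/lcm(1..n) rather than continuously, the condition of line 11 is tested with "≥" instead of "=", and only clients whose budget covers the connection cost (or who strictly save by switching) are considered for a set. The name `greedy_concave` and the helper `find_event` are hypothetical.

```python
from fractions import Fraction
from math import lcm  # Python 3.9+

def greedy_concave(conn, fcost):
    """conn[i][j]: integer cost between facility i and client j.
    fcost[i](m): integer concave cost of facility i for m clients, fcost[i](0) == 0.
    Returns (p, alpha): the assignment and the time each client first connects."""
    m, n = len(conn), len(conn[0])
    step = Fraction(1, lcm(*range(1, n + 1)))  # assumed time grid
    t = Fraction(0)
    level = [0] * m
    p = [None] * n
    alpha = [None] * n

    def find_event():
        """Return (facility, clients) meeting the condition of line 11, or None."""
        for i in range(m):
            # Unconnected clients whose budget t covers the connection cost.
            free = [(t - conn[i][j], j) for j in range(n)
                    if p[j] is None and t >= conn[i][j]]
            if not free:
                continue
            # Connected clients that would strictly save by switching to i.
            moved = [(Fraction(conn[p[j]][j] - conn[i][j]), j) for j in range(n)
                     if p[j] is not None and conn[p[j]][j] > conn[i][j]]
            best = max(free)  # every set must contain an unconnected client
            rest = sorted((o for o in free + moved if o != best), reverse=True)
            chosen, total = [best], best[0]
            for extra in [None] + rest:
                if extra is not None:
                    chosen.append(extra)
                    total += extra[0]
                k = level[i] + len(chosen)
                if total >= fcost[i](k) - fcost[i](level[i]):
                    return i, [j for _, j in chosen]
        return None

    while None in p:
        t += step
        event = find_event()
        while event is not None:
            i, clients = event
            for j in clients:
                if p[j] is None:
                    alpha[j] = t  # time of first connection
                p[j] = i
            level[i] += len(clients)  # levels only grow, as in the pseudocode
            event = find_event()
    return p, alpha

# Hypothetical instance: one facility, two clients, f(1) = 4, f(2) = 6.
p, alpha = greedy_concave([[1, 3]], [lambda k: [0, 4, 6][k]])
```

In this instance both clients connect at time 5, so the sum of the $\alpha_j$'s is 10, which equals the connection costs 1 + 3 plus the facility cost f(2) = 6.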

The proof of the following lemma is clear from the algorithm and the discussion above.

Lemma 1. The total cost of the solution found by Algorithm A is equal to the sum of the $\alpha_j$'s.

The above algorithm is similar to the greedy algorithm of Jain et al. [7]. The difference is that here we
define the concept of level for a facility, which is the number of clients assigned to it, and the events in which
we assign clients to facilities depend on the level of the facilities.

4 The approximation factor 

In this section, we show that Algorithm A is indeed a 1.94-approximation algorithm for the concave
facility location problem. We prove this by showing that for each star S, the ratio of the sum of the $\alpha_j$'s
of all clients contained in S to the total cost of S is at most $\gamma \approx 1.94$. Analogous to the work of
Jain et al. [7], our approach is as follows. First, based on the behavior of the algorithm, we obtain some linear
constraints, called the factor-revealing LP, on the $\alpha_j$'s and the cost of S. Next, we show that for any
feasible solution of the LP (not necessarily the one obtained from the algorithm) our objective ratio is at most
$\gamma \approx 1.94$. Here, an LP-solver helps us to guess such a ratio, and we then prove this upper
bound by a lengthy calculation.

To derive the factor-revealing LP, we first need some definitions and notation. Consider a star S consisting
of a facility p and k clients numbered 1 through k. Let $d_j$ denote the connection cost between facility p
and client j, and $\alpha_j$ denote the share of j of the total expenses (see the definition of $\alpha$ in the
algorithm). The cost of the star is $f_p(k) + \sum_{i=1}^{k} d_i$. For simplicity, we set $f = f_p(k)$. Without loss
of generality, we assume $\alpha_1 \le \alpha_2 \le \dots \le \alpha_k$. Let the critical time for a client i be the
time just before i gets connected for the first time, i.e., when $t = \alpha_i - \epsilon$ where $\epsilon$ is very
small. At the critical time for client i, each of the clients 1, 2, ..., i - 1 might be connected to a facility.
For every j < i, if client j is connected to some facility at time t, let $r_{j,i}$ denote the connection cost
between this facility and client j; otherwise, let $r_{j,i} := \alpha_j$ (in this case $\alpha_i = \alpha_j$).

First we note that since the budget of a client remains constant once it gets connected to a facility, and it
may not get connected to another facility with a higher connection cost,
$r_{j,j+1} \ge r_{j,j+2} \ge \dots \ge r_{j,k}$. Now we obtain more constraints.

Lemma 2. At the critical time for a client i, for every subset $S_1 \subseteq \{1, \dots, i - 1\}$ and every subset
$S_2 \subseteq \{i, \dots, k\}$ ($S_2 \ne \emptyset$),

$$\sum_{j \in S_1} \max(r_{j,i} - d_j, 0) + \sum_{j \in S_2} \max(\alpha_i - d_j, 0) \le f_p(|S_1| + |S_2|). \tag{1}$$

In particular, $\sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - d_j, 0) \le f_p(k) = f$.

Proof. The amount client j offers to facility p at time $t = \alpha_i - \epsilon$ is $\max(r_{j,i} - d_j, 0)$ if j < i,
and $\max(t - d_j, 0)$ if $j \ge i$. By the definition of $r_{j,i}$ this holds even if j < i and $\alpha_i = \alpha_j$.
From the algorithm, the total offer of the clients in $S_1 \cup S_2$ to facility p may not become larger than the
cost of facility p at level $|S_1| + |S_2|$, since otherwise all these clients would have been assigned to the
facility p. Thus, for all i,
$\sum_{j \in S_1} \max(r_{j,i} - d_j, 0) + \sum_{j \in S_2} \max(\alpha_i - d_j, 0) \le f_p(|S_1| + |S_2|)$.
In particular, by setting $S_1 = \{1, \dots, i - 1\}$ and $S_2 = \{i, \dots, k\}$, we get
$\sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - d_j, 0) \le f$. □

So far, we have not used the triangle inequality of connection costs and concavity of facility cost functions. 
We use these assumptions in the next lemma. 

Lemma 3. At the critical time for a client i, for all clients j such that $1 \le j < i$,

$$\alpha_i \le \alpha_j + r_{j,i} + d_i + d_j. \tag{2}$$

Proof. Let p' be the facility that j is connected to at time $t = \alpha_i - \epsilon$. By the triangle inequality,
the connection cost $c_{p'i}$ between client i and facility p' is at most $c_{p'j} + d_i + d_j$. By the definition of
$r_{j,i}$, we have $c_{p'j} \le r_{j,i}$. Thus these two inequalities imply

$$c_{p'i} \le r_{j,i} + d_i + d_j.$$

Furthermore, if the level of p' is equal to l when client j gets connected to it, $c_{p'i}$ cannot be less than
$t - (f_{p'}(l + 1) - f_{p'}(l))$, since otherwise client i would have been connected to the facility p' at a time
earlier than t, which is a contradiction. This shows that
$\alpha_i - \epsilon = t \le c_{p'i} + f_{p'}(l + 1) - f_{p'}(l)$. From the last inequality and this one,
$\alpha_i \le r_{j,i} + d_i + d_j + f_{p'}(l + 1) - f_{p'}(l)$. From the fact that $f_{p'}$ is a concave function, it
follows that for all q < l,

$$f_{p'}(l + 1) - f_{p'}(l) \le \frac{f_{p'}(l) - f_{p'}(q)}{l - q}.$$

Now consider the time $\alpha_j$ at which j gets connected to facility p'. Let q be the level of p' before time
$\alpha_j$, and suppose that at this time l - q new clients $b_1, b_2, \dots, b_{l-q}$ are assigned to facility p'.
At time $\alpha_j$ the total offer of these l - q clients to p' is equal to $f_{p'}(l) - f_{p'}(q)$.
The offer of $b_s$ to facility p' is equal to either $\alpha_j - c_{b_s,p'}$ or $c_{b_s,p''} - c_{b_s,p'}$, depending
on the situation of client $b_s$ at time $\alpha_j$ (whether it was unconnected or assigned to a facility p'').
In either case, the offer of $b_s$ is at most $\alpha_j - c_{b_s,p'} \le \alpha_j$, thus

$$f_{p'}(l) - f_{p'}(q) = \sum_{s=1}^{l-q} \mathrm{offer}(b_s) \le \sum_{s=1}^{l-q} \alpha_j = (l - q)\,\alpha_j
\ \Longrightarrow\ \frac{f_{p'}(l) - f_{p'}(q)}{l - q} \le \alpha_j.$$

Combining all these inequalities, we get the following inequality: for every $1 \le j < i \le k$,
$\alpha_i \le r_{j,i} + d_i + d_j + \alpha_j$. □



The following optimization program, called the factor-revealing LP, can be obtained from the above
inequalities. We note that by scaling $f + \sum_{i=1}^{k} d_i = 1$ and introducing new variables and new
constraints for the function max, we can obtain a linear program.

$$
\begin{aligned}
\text{maximize}\quad & \frac{\sum_{i=1}^{k} \alpha_i}{f + \sum_{i=1}^{k} d_i} \\
\text{subject to}\quad & \forall\, 1 \le i < k:\ \alpha_i \le \alpha_{i+1} \\
& \forall\, 1 \le j < i < k:\ r_{j,i} \ge r_{j,i+1} \\
& \forall\, 1 \le j < i \le k:\ \alpha_i \le \alpha_j + r_{j,i} + d_i + d_j \\
& \forall\, 1 \le i \le k,\ S_1 \subseteq \{1, \dots, i-1\},\ S_2 \subseteq \{i, \dots, k\}: \\
& \qquad \sum_{j \in S_1} \max(r_{j,i} - d_j, 0) + \sum_{j \in S_2} \max(\alpha_i - d_j, 0) \le f_p(|S_1| + |S_2|) \\
& \forall\, 1 \le j \le i \le k:\ \alpha_j, d_j, f, r_{j,i} \ge 0
\end{aligned}
\tag{3}
$$

The size of the above program is large (exponential) because of the fourth set of inequalities, and it is hard
to find the solution of the program for large k. To address this, we observe that, using
Lemma 2, we can relax the fourth set of inequalities and still get the approximation factor 1.94.

$$
\begin{aligned}
\text{maximize}\quad & \frac{\sum_{i=1}^{k} \alpha_i}{f + \sum_{i=1}^{k} d_i} \\
\text{subject to}\quad & \forall\, 1 \le i < k:\ \alpha_i \le \alpha_{i+1} \\
& \forall\, 1 \le j < i < k:\ r_{j,i} \ge r_{j,i+1} \\
& \forall\, 1 \le j < i \le k:\ \alpha_i \le \alpha_j + r_{j,i} + d_i + d_j \\
& \forall\, 1 \le i \le k:\ \sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - d_j, 0) \le f \\
& \forall\, 1 \le j \le i \le k:\ \alpha_j, d_j, f, r_{j,i} \ge 0
\end{aligned}
\tag{4}
$$

After this relaxation, the number of inequalities in the optimization program is polynomial. 
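
To make the relaxed program concrete, the following Python sketch checks whether a given point satisfies the constraints of the program with the polynomial number of inequalities and, if so, returns its objective value. It is a feasibility checker only, not an LP solver; the function name `lp4_ratio`, the 1-indexed padding convention, and the sample point are assumptions made for illustration.

```python
def lp4_ratio(alpha, d, f, r):
    """Return sum(alpha) / (f + sum(d)) if (alpha, d, f, r) satisfies the
    relaxed factor-revealing constraints, else None.
    Clients are 1-indexed: alpha[0] and d[0] are unused padding, and
    r[(j, i)] holds r_{j,i} for j < i."""
    k = len(alpha) - 1
    feasible = (
        # alpha_i <= alpha_{i+1}
        all(alpha[i] <= alpha[i + 1] for i in range(1, k))
        # r_{j,i} >= r_{j,i+1}
        and all(r[j, i] >= r[j, i + 1] for i in range(2, k) for j in range(1, i))
        # alpha_i <= alpha_j + r_{j,i} + d_i + d_j
        and all(alpha[i] <= alpha[j] + r[j, i] + d[i] + d[j]
                for i in range(2, k + 1) for j in range(1, i))
        # sum_{j<i} max(r_{j,i} - d_j, 0) + sum_{j>=i} max(alpha_i - d_j, 0) <= f
        and all(sum(max(r[j, i] - d[j], 0) for j in range(1, i))
                + sum(max(alpha[i] - d[j], 0) for j in range(i, k + 1)) <= f
                for i in range(1, k + 1))
        # non-negativity
        and min(alpha[1:] + d[1:] + [f] + list(r.values())) >= 0
    )
    return sum(alpha[1:]) / (f + sum(d[1:])) if feasible else None

# Hypothetical feasible point for k = 2.
ratio = lp4_ratio(alpha=[None, 1, 2], d=[None, 0, 1], f=1, r={(1, 2): 0})
```

For this point the objective is (1 + 2) / (1 + 0 + 1); raising $\alpha_2$ to 3 violates the third constraint and the checker returns None.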

Theorem 1. Let $\gamma$ be $\sup_k \{z_k\}$, where $z_k$ is the solution of the factor-revealing LP. Then
Algorithm A is a $\gamma$-approximation algorithm for the concave facility location problem.

Proof. Since the values $\alpha_j$, $d_j$, $f$ and $r_{j,i}$ obtained from Algorithm A satisfy all inequalities in
the LP, the value of the objective function for them is at most $z_k$. This implies that for each star S
consisting of one facility and k clients $i_1, \dots, i_k$, $\sum_{j=1}^{k} \alpha_{i_j}$ is at most $z_k c_S$. The
proof of the theorem follows from this fact and Lemma 1. □

In the next step, we use an LP-solver like CPLEX to obtain the optimum solution of the factor-revealing
LP for fixed k. The results are shown in Table 1.

      k      $\max_{i \le k} z_i$
     10      1.83517
     20      1.88389
     50      1.91573
    100      1.92652
    200      1.93193

Table 1. Solutions of the factor-revealing LP



Using these experimental results one can observe that $\gamma \approx 1.94$; however, all of them are just lower
bounds for the LP. To obtain the desired approximation ratio, we need to prove an upper bound of 1.94 on the
maximum solution of the LP. The proof requires lengthy calculations, and is therefore presented in Appendix
A. It is worth mentioning that Mahdian et al. [7, 11] also use this kind of tedious calculation for the facility
location problem; however, since our LP is different from theirs, the calculations are different. Finally we
have the following theorem:

Theorem 2. Algorithm A is a 1.94-approximation algorithm with running time $O(n^4)$ for the concave
facility location problem, where $n = \max(n_f, n_c)$.

The following improvement for the concave facility location problem has been suggested by one of the
referees of SODA 2003. We note that Algorithm A is still used for more general connection
costs in Section 6.

Theorem 3. There exists a 1.52-approximation algorithm for the concave facility location problem.

Proof. The main idea is that in our problem any concave function f(x) can be represented as
$\min_{1 \le k \le n} \{(f(k) - f(k-1))\,x + k f(k-1) - (k-1) f(k)\}$. Now, we take each facility i with concave cost
function $f_i$, and replace it by multiple facilities $i_1, i_2, \dots, i_n$ such that facility $i_k$ has opening
cost $k f_i(k-1) - (k-1) f_i(k)$. In addition, each unit of demand routed to this location costs
$f_i(k) - f_i(k-1)$ extra at the facility. This cost $f_i(k) - f_i(k-1)$ can be added to the distance metric. For
this facility location problem, Mahdian et al. [11] have a 1.52-approximation. □
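
The representation used in the proof can be checked numerically. The sketch below, with hypothetical function names and a hypothetical concave cost table, builds for each k the line through (k-1, f(k-1)) and (k, f(k)), whose slope is the per-client surcharge and whose intercept is the opening cost of the k-th replacement facility, and evaluates the minimum over all lines.

```python
def line_representation(f, n):
    """For k = 1..n, return (slope, intercept) of the line through
    (k-1, f(k-1)) and (k, f(k)): slope f(k) - f(k-1) is the per-client cost and
    intercept k*f(k-1) - (k-1)*f(k) is the opening cost of facility i_k."""
    return [(f(k) - f(k - 1), k * f(k - 1) - (k - 1) * f(k)) for k in range(1, n + 1)]

def eval_min_lines(lines, x):
    """Evaluate min_k (slope_k * x + intercept_k)."""
    return min(s * x + b for s, b in lines)

table = [0, 5, 8, 10, 11]          # hypothetical concave cost values f(0..4)
f = lambda x: table[x]
lines = line_representation(f, 4)
recovered = [eval_min_lines(lines, x) for x in range(5)]
```

For a concave f the minimum over the lines reproduces f at every integer point, which is what lets the k-th replacement facility stand in for f when exactly k clients are served.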

5 Generalizations 

In this section, we consider more general variants of the problem such as the problem with relaxed metric 
inequality. 

5.1 Other facility cost functions 

So far, we have shown approximation algorithms for the concave facility location problem. In this section,
we consider the generalized facility location problem with more general facility cost functions. First, we
give a polynomial time algorithm for the generalized facility location problem with convex facility cost
functions. Recall that a function f is called convex if for every $x \ge 1$,
$f(x + 1) - f(x) \ge f(x) - f(x - 1)$.

Theorem 4. The generalized facility location problem with convex facility cost functions can be solved in 
polynomial time. 

Proof. First we reduce the problem to the capacitated facility location problem with unit hard capacities.
Next, we show how this problem can be solved in polynomial time. For each facility $i \in \mathcal{F}$, we place n
copies of unit-capacity facilities where $f_i^j$, the opening cost of the j-th facility of the ones corresponding
to facility $i \in \mathcal{F}$, is $f_i(j + 1) - f_i(j)$, $0 \le j \le n - 1$. The correctness of the reduction
follows from the fact that if we use a facility $f_i^j$, we should use all facilities $f_i^k$, k < j, since the
function $f_i$ is convex. We now solve the problem by minimum weighted matching on bipartite graphs. We construct
a bipartite graph $G = (X \cup Y, E)$ as follows. For each client j, we place a vertex in the set X; for each
unit-capacity facility i, we place a vertex in the set Y; and finally we place an edge {j, i} in E between a
client j and a unit-capacity facility i with weight $c_{ij} + f_i$, where $f_i$ is the opening cost of i. We can
easily observe that by solving minimum weighted matching [15] for G, one can find an optimum
assignment for the original problem. □
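
The reduction can be checked on a tiny instance. The sketch below is only an illustration: it uses brute-force enumeration in place of a polynomial minimum weighted matching algorithm, it assumes $f_i(0) = 0$, and the function names and the sample costs are hypothetical.

```python
from itertools import permutations, product

def unit_copies(conn, fcost):
    """Copy j (j = 0..n-1) of facility i has capacity 1 and opening cost
    f_i(j+1) - f_i(j); returns (facility, opening_cost) pairs."""
    n = len(conn[0])
    return [(i, fcost[i](j + 1) - fcost[i](j))
            for i in range(len(conn)) for j in range(n)]

def cost_by_matching(conn, fcost):
    """Cheapest matching of clients to distinct unit copies (brute force,
    standing in for a polynomial min-weight bipartite matching algorithm)."""
    n = len(conn[0])
    return min(sum(conn[i][j] + w for j, (i, w) in enumerate(pick))
               for pick in permutations(unit_copies(conn, fcost), n))

def cost_directly(conn, fcost):
    """Optimum of the original problem by enumerating all assignments."""
    m, n = len(conn), len(conn[0])
    return min(sum(conn[phi[j]][j] for j in range(n))
               + sum(fcost[i](phi.count(i)) for i in range(m))
               for phi in product(range(m), repeat=n))

conn = [[1, 2, 4], [3, 1, 1]]                    # hypothetical costs c_ij
fcost = [lambda x: x * x, lambda x: 2 * x * x]   # convex, with f_i(0) = 0
```

Because the marginal costs of a convex function are nondecreasing, a cheapest matching always uses the first copies of each facility, so both functions return the same optimum on this instance.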

The problem of finding a constant factor approximation algorithm for general facility cost functions is 
open. Here, we observe that Algorithm A can be used to find constant factor approximation algorithms 
for cost functions that are close to concave functions. 

Definition 2. A function $f : \mathbb{N} \to \mathbb{N}$ is a c-close concave function if there exists a concave
function g such that $\forall x \in \mathbb{N}: \frac{g(x)}{c} \le f(x) \le g(x)$. The function
$f : \mathbb{N} \to \mathbb{N}$ is c-concave if and only if for all l and q such that q < l, we have
$f(l + 1) - f(l) \le c\,\frac{f(l) - f(q)}{l - q}$.
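
For a function given by a table of values, the smallest c for which the c-concavity inequality holds on a finite range can be computed directly. The following sketch is an illustration under assumptions: the name `concavity_factor` is hypothetical, the check covers only $0 \le q < l < n$, and it returns None when a positive marginal cost follows a non-positive gain, since no finite c works in that case.

```python
from fractions import Fraction

def concavity_factor(f, n):
    """Smallest c with f(l+1) - f(l) <= c * (f(l) - f(q)) / (l - q)
    for all 0 <= q < l < n, or None if no finite c exists on this range."""
    c = Fraction(0)
    for l in range(1, n):
        marginal = f(l + 1) - f(l)
        if marginal <= 0:
            continue  # the inequality holds for any c >= 0 when gains are >= 0
        for q in range(l):
            gain = f(l) - f(q)
            if gain <= 0:
                return None
            c = max(c, Fraction(marginal * (l - q), gain))
    return c

concave_table = [0, 5, 8, 10, 11]   # hypothetical concave values f(0..4)
bumpy_table = [0, 1, 3, 4]          # hypothetical non-concave values f(0..3)
c_concave = concavity_factor(lambda x: concave_table[x], 4)
c_bumpy = concavity_factor(lambda x: bumpy_table[x], 3)
```

A concave, nondecreasing function yields a factor of at most 1, while the second table needs c = 2 because its second increment is twice its first.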

In the case of c-close concave functions, we have the following simple theorem. 

Theorem 5. There is a constant factor approximation algorithm of factor 1.94c for the generalized facility
location problem with cost functions that are c-close to concave functions.

Proof. Consider functions $g_i$ such that $\frac{g_i}{c} \le f_i \le g_i$. We use Algorithm A to solve the problem
for the facility cost functions $g_i$. We know that the cost of this solution is at most 1.94 times the optimal
solution for the facility cost functions $g_i$, and since $f_i \le g_i$ its cost under the $f_i$'s is no larger.
Using the inequality $\frac{g_i}{c} \le f_i$, the optimal solution for the $g_i$'s is at most c
times the optimal solution for the $f_i$'s. Thus, the approximation factor is 1.94c. □

For c-concave functions, one can observe that by a proof similar to that of Lemma 3, if the $f_i$'s
are c-concave, then for all $1 \le j < i \le k$, $\alpha_i \le c\,\alpha_j + r_{j,i} + d_i + d_j$. Therefore, using
Algorithm A for the functions $f_i$, the approximation factor of the algorithm is the optimal solution of the same
factor-revealing LP, except that the third set of inequalities is replaced by
$\alpha_i \le c\,\alpha_j + r_{j,i} + d_i + d_j$. Table 2
shows a summary of the results obtained by solving the factor-revealing LP using CPLEX for k = 100.
From the experimental results, it turns out that $z_k$ depends on c by an asymptotically linear function and
the approximation factor is again a constant.



      c      $\max_{i \le k} z_i$          $\lambda$      $\max_{i \le k} z_i$
    0.2      1.6579                        0.2      1.3348
    0.5      1.75595                       0.5      1.6102
      1      1.92652                         2      2.4456
      2      2.29422                        10      4.4621
     10      5.98046                        50      5.0569

Table 2. Approximation factor for c-concave functions and $\lambda$-parameterized metric (k = 100)



5.2 More general connection costs 

It is easy to see that the facility location problem (and therefore the generalized facility location problem)
is NP-hard to approximate within a factor less than O(ln n) if the connection costs are not metric. Also,
it is not difficult to see that the classical set cover algorithm can approximate the non-metric concave
facility location problem within a factor of O(ln n). However, one can observe that Algorithm A works
very well when the metric inequality is somewhat relaxed. In this case, instead of the triangle inequality
($AC \le AB + BC$), a parameterized triangle inequality ($AC \le \lambda(AB + BC)$) is satisfied. It is
straightforward to restate the proof of Lemma 3 and prove that for all $1 \le j < i \le k$,
$\alpha_i \le \lambda(r_{j,i} + d_i + d_j) + \alpha_j$.
Therefore, we have the same factor-revealing LP, except that the third set of inequalities is replaced by
$\alpha_i \le \lambda(r_{j,i} + d_i + d_j) + \alpha_j$. Using CPLEX, we obtained the optimum solution of this
factor-revealing LP for k = 100. Table 2 shows the empirical results for different values of $\lambda$. These
results show that Algorithm A works well when the metric inequality is somewhat relaxed.
6 General connection costs

It is easy to see that the facility location problem (and therefore the generalized facility location problem)
is NP-hard to approximate within a factor less than O(ln n) if the connection costs are not metric. Also, it
is not difficult to see that the classical set cover algorithm can approximate the non-metric concave facility
location problem within a factor of O(ln n). However, this reduction does not work for the non-metric
generalized facility location problem. For this case, we can prove that Algorithm A has an approximation
factor of ln n, where n is the number of clients.

Theorem 6. Algorithm A achieves an approximation factor of ln n for the generalized facility location
problem when the connection cost is non-metric.

Proof. Recall the definitions of $\alpha_j$, $d_j$, and $f$ in Section 4. By Lemma 2, we have
$\sum_{j=i}^{k} (\alpha_i - d_j) \le f$ (notice that the concavity assumption was not used in the proof of Lemma 2).
Thus,

$$\alpha_i \le \frac{1}{k - i + 1}\Bigl(f + \sum_{j=i}^{k} d_j\Bigr) \le \frac{1}{k - i + 1}\Bigl(f + \sum_{j=1}^{k} d_j\Bigr).$$

It follows that

$$\sum_{i=1}^{k} \alpha_i \le \sum_{i=1}^{k} \frac{1}{k - i + 1}\Bigl(f + \sum_{j=1}^{k} d_j\Bigr)
= H_k \Bigl(f + \sum_{j=1}^{k} d_j\Bigr) \le (\ln n)\, c_S. \qquad \square$$

The above theorem implies that the capacitated facility location problem can be approximated within a factor
of ln n. To the best of our knowledge, this is the first approximation algorithm for the (hard) capacitated
facility location problem when the connection cost is non-metric. Also, it is not difficult to observe that
if instead of the metric inequality, connection costs satisfy a relaxed version of the metric inequality, then
by proving a relaxed version of the inequality in Lemma 3 and solving the corresponding factor-revealing
LP, one can obtain the approximation factor of the algorithm.

Acknowledgments. We would like to thank Tom Leighton, Rajmohan Rajaraman and Ravi Sundaram 
for introducing the problem. 
A Proof of Theorem 2

In order to prove this upper bound, we first state the following lemma, which allows us to prove the theorem
for the case of sufficiently large k. The proof of the lemma is the same as the proof of Lemma 14
in [7] and hence omitted.

Lemma 4. If $z_k$ denotes the solution to the factor-revealing LP, then for every k, $z_k \le z_{2k}$.

Now, in order to prove the approximation factor, the objective is to combine the inequalities of Program
4 to derive an inequality of the form $\sum_{i=1}^{k} \alpha_i \le \gamma\,(f + \sum_{i=1}^{k} d_i)$. Such a
$\gamma$ will be an upper bound on the solution of Program 4.

We start by relaxing the fourth inequality of Program 4 to the following inequality:

$$\sum_{j=i}^{l_i} (\alpha_i - d_j) + \sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) \le f, \tag{5}$$

where, as in [6], we define $l_i$ as follows:

$$l_i = \begin{cases} p_2 k & \text{if } i \le p_1 k \\ k & \text{if } i > p_1 k \end{cases} \tag{6}$$

Here $p_1$ and $p_2$ are constants with $0 < p_1 < p_2 < 1$ which will be fixed later in the proof.
Inequality 5 implies

$$\alpha_i \le \frac{1}{l_i - i + 1}\Bigl(f + \sum_{j=i}^{l_i} d_j - \sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0)\Bigr). \tag{7}$$

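We multiply both sides of the above inequality by a constant $\theta_i$ that will be fixed later, and add the
resulting inequalities. This implies the following inequality:

$$\sum_{i=1}^{k} \theta_i \alpha_i \le \sum_{i=1}^{k} \frac{\theta_i}{l_i - i + 1}
\Bigl(f + \sum_{j=i}^{l_i} d_j - \sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0)\Bigr). \tag{8}$$

Let $\zeta$ and $\lambda_j$ be defined as follows:

$$\zeta := \sum_{i=1}^{k} \frac{\theta_i}{l_i - i + 1}, \tag{9}$$

$$\lambda_j := \sum_{i \,:\, i \le j \le l_i} \frac{\theta_i}{l_i - i + 1}
= \begin{cases} \sum_{i=1}^{j} \frac{\theta_i}{l_i - i + 1} & \text{if } j \le p_2 k \\[4pt]
\sum_{i=p_1 k + 1}^{j} \frac{\theta_i}{k - i + 1} & \text{if } j > p_2 k \end{cases} \tag{10}$$

Therefore, Inequality 8 can be written as follows:

$$\sum_{i=1}^{k} \theta_i \alpha_i \le \zeta f + \sum_{j=1}^{k} \lambda_j d_j
- \sum_{i=1}^{k} \sum_{j=1}^{i-1} \frac{\theta_i}{l_i - i + 1} \max(r_{j,i} - d_j, 0). \tag{11}$$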

In the above inequality, some $\theta_i$'s are greater than 1 and others are less than or equal to 1. The next step
is to use the inequality $\alpha_i \le \alpha_j + r_{j,i} + d_i + d_j$ to make the coefficient of all $\alpha_i$'s on
the left-hand side of the inequality equal to 1. We assume the $\theta_i$'s are chosen in a way that

$$\sum_{i=1}^{k} \theta_i = k. \tag{12}$$

Also, we assume there is a constant $p_3$ such that

$$\forall i \le p_3 k:\ \theta_i \ge 1 \quad \text{and} \quad \forall i > p_3 k:\ \theta_i \le 1. \tag{13}$$

We will make sure that the $\theta_i$'s satisfy the above constraints when we fix their values later in the proof.
Now consider the inequality $\alpha_i \le \alpha_j + r_{j,i} + d_i + d_j$ for $i > p_3 k$ and $j \le p_3 k$. Multiply
both sides of this inequality by a constant $\omega_{ij} \ge 0$, and add up all these inequalities with Inequality 11.

If we choose the $\omega_{ij}$'s in such a way that

$$\sum_{i > p_3 k} \omega_{ij} = \theta_j - 1 \quad \forall j \le p_3 k \qquad \text{and} \qquad
\sum_{j \le p_3 k} \omega_{ij} = 1 - \theta_i \quad \forall i > p_3 k, \tag{14}$$

we will get the following inequality:

$$\sum_{i=1}^{k} \alpha_i \le \zeta f + \sum_{i=1}^{k} (\lambda_i + |1 - \theta_i|)\, d_i
+ \sum_{i > p_3 k} \sum_{j \le p_3 k} \omega_{ij}\, r_{j,i}
- \sum_{i=1}^{k} \sum_{j=1}^{i-1} \frac{\theta_i}{l_i - i + 1} \max(r_{j,i} - d_j, 0). \tag{15}$$

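Now, we define $p_4 \le p_1$ so that we only use the triangle inequalities
$\alpha_i \le r_{j,i} + \alpha_j + d_i + d_j$ for $i > p_2 k$ and $p_4 k < j \le p_3 k$, or for
$p_3 k < i \le p_2 k$ and $j \le p_4 k$. From this definition of $p_4$, first we need that

$$\sum_{i = p_3 k + 1}^{p_2 k} (1 - \theta_i) \le \sum_{j=1}^{p_4 k} (\theta_j - 1), \tag{16}$$

and also we choose the $\omega_{ij}$'s such that $0 < p_4 \le p_1$ and

$$\omega_{ij} \ne 0 \ \Longrightarrow\ (i > p_2 k \text{ and } p_4 k < j \le p_3 k) \ \text{or}\
(p_3 k < i \le p_2 k \text{ and } j \le p_4 k). \tag{17}$$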

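Now, using Condition 17 and the fact that $r_{j,i} \ge r_{j,i+1}$, we can write Inequality 15 as follows:

$$
\begin{aligned}
\sum_{i=1}^{k} \alpha_i \le{}& \zeta f + \sum_{i=1}^{k} (\lambda_i + |1 - \theta_i|)\, d_i
+ \sum_{i = p_2 k + 1}^{k} \sum_{j = p_4 k + 1}^{p_3 k} \omega_{ij}\, r_{j,i}
+ \sum_{i = p_3 k + 1}^{p_2 k} \sum_{j \le p_4 k} \omega_{ij}\, r_{j,i} \\
& - \sum_{j=1}^{k} \sum_{i = j+1}^{k} \frac{\theta_i}{l_i - i + 1} \max(r_{j,i} - d_j, 0) \\
\le{}& \zeta f + \sum_{i=1}^{k} (\lambda_i + |1 - \theta_i|)\, d_i
+ \sum_{j = p_4 k + 1}^{p_3 k} \sum_{i = p_2 k + 1}^{k} \omega_{ij}\, r_{j,p_2 k}
+ \sum_{j=1}^{p_4 k} \sum_{i = p_3 k + 1}^{p_2 k} \omega_{ij}\, r_{j,p_3 k} \\
& - \sum_{j=1}^{p_3 k} \sum_{i = j+1}^{k} \frac{\theta_i}{l_i - i + 1} \max(r_{j,i} - d_j, 0).
\end{aligned}
$$

Using Equations 14 and 17, it turns out that $\sum_{i > p_2 k} \omega_{ij} = \theta_j - 1$ for
$p_4 k < j \le p_3 k$ and $\sum_{p_3 k < i \le p_2 k} \omega_{ij} = \theta_j - 1$ for $j \le p_4 k$. Thus,

$$
\begin{aligned}
\sum_{i=1}^{k} \alpha_i - \zeta f \le{}& \sum_{i=1}^{k} (\lambda_i + |1 - \theta_i|)\, d_i
+ \sum_{j = p_4 k + 1}^{p_3 k} r_{j,p_2 k} (\theta_j - 1) + \sum_{j=1}^{p_4 k} r_{j,p_3 k} (\theta_j - 1) \\
& - \sum_{j=1}^{p_4 k} \sum_{i = j+1}^{k} \frac{\theta_i}{l_i - i + 1} \max(r_{j,i} - d_j, 0)
- \sum_{j = p_4 k + 1}^{p_3 k} \sum_{i = j+1}^{k} \frac{\theta_i}{l_i - i + 1} \max(r_{j,i} - d_j, 0).
\end{aligned}
$$

Notice that for $j < i \le p_2 k$, $r_{j,i} \ge r_{j,p_2 k}$, and for $j < i \le p_3 k$, $r_{j,i} \ge r_{j,p_3 k}$.
After substitution of these values in the above inequality (and dropping the remaining non-positive terms), we
get the following:

$$
\begin{aligned}
\sum_{i=1}^{k} \alpha_i - \zeta f \le{}& \sum_{i=1}^{k} (\lambda_i + |1 - \theta_i|)\, d_i
+ \sum_{j = p_4 k + 1}^{p_3 k} r_{j,p_2 k} (\theta_j - 1) + \sum_{j=1}^{p_4 k} r_{j,p_3 k} (\theta_j - 1) \\
& - \sum_{j=1}^{p_4 k} \sum_{i = j+1}^{p_3 k} \frac{\theta_i}{l_i - i + 1} \max(r_{j,p_3 k} - d_j, 0)
- \sum_{j = p_4 k + 1}^{p_3 k} \sum_{i = j+1}^{p_2 k} \frac{\theta_i}{l_i - i + 1} \max(r_{j,p_2 k} - d_j, 0) \\
\le{}& \sum_{i=1}^{p_3 k} (\lambda_i + 2(\theta_i - 1))\, d_i + \sum_{i = p_3 k + 1}^{k} (\lambda_i + 1 - \theta_i)\, d_i \\
& + \sum_{j=1}^{p_4 k} \max(r_{j,p_3 k} - d_j, 0) \Bigl(\theta_j - 1 - \sum_{i = j+1}^{p_3 k} \frac{\theta_i}{l_i - i + 1}\Bigr) \\
& + \sum_{j = p_4 k + 1}^{p_3 k} \max(r_{j,p_2 k} - d_j, 0) \Bigl(\theta_j - 1 - \sum_{i = j+1}^{p_2 k} \frac{\theta_i}{l_i - i + 1}\Bigr),
\end{aligned}
$$

where the last step uses $r \le d_j + \max(r - d_j, 0)$ and $\theta_j \ge 1$ for $j \le p_3 k$.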

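Now, if the $\theta_i$'s satisfy

$$\sum_{i = j+1}^{p_3 k} \frac{\theta_i}{l_i - i + 1} \ge \theta_j - 1 \ \ \text{if } j \le p_4 k,
\qquad
\sum_{i = j+1}^{p_2 k} \frac{\theta_i}{l_i - i + 1} \ge \theta_j - 1 \ \ \text{if } p_4 k < j \le p_3 k, \tag{18}$$

this implies

$$\sum_{i=1}^{k} \alpha_i \le \zeta f + \sum_{i=1}^{p_3 k} (\lambda_i + 2(\theta_i - 1))\, d_i
+ \sum_{i = p_3 k + 1}^{k} (\lambda_i + 1 - \theta_i)\, d_i. \tag{19}$$

Now, if we set the coefficients of the $d_i$'s in the right-hand side of the above inequality to $\gamma$, we will
get the following recurrence for $\theta_i$:

$$\begin{cases} \lambda_i + 2(\theta_i - 1) = \gamma & \text{if } i \le p_3 k \\
\lambda_i + 1 - \theta_i = \gamma & \text{if } i > p_3 k \end{cases} \tag{20}$$

Notice that the $\lambda_j$'s can be written as a function of the $\theta_i$'s ($i \le j$) from Equation 10, thus
the above equations are recurrence relations for $\theta_i$.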

Solving this recurrence, we get 

^^5 if Pik<i< P3 k 



6i = < 



(21) 



(3-20 P3 *)^f i{p3k<i<p 2 k 







if P2& < i < k 



If we can set the constants $p_1, p_2, p_3, p_4$ in such a way that the $\theta_i$'s satisfy Conditions 12, 13, 16,
and 18, then Inequality 19 shows that the solution of the factor-revealing LP is at most $\max(\zeta, \gamma)$.

It is not hard to see that the $\theta_i$'s are decreasing from 1 to $p_3 k$ and increasing from $p_3 k + 1$ to
$p_2 k$; thus, in order to satisfy Condition 13, it is sufficient to have:

$$\theta_{p_3 k} \ge 1 \quad \text{and} \quad \theta_{p_2 k} \le 1. \tag{22}$$
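In order to write Conditions 12 and 16, we compute the sum of the $\theta_i$'s for different intervals separately.

$$\sum_{j=1}^{p_3 k} \lambda_j = (\gamma + 2)\, p_3 k - 2 \sum_{j=1}^{p_3 k} \theta_j \qquad \text{and}$$

$$
\begin{aligned}
\sum_{j=1}^{p_3 k} \lambda_j &= \sum_{j=1}^{p_3 k} \sum_{i=1}^{j} \frac{\theta_i}{l_i - i + 1}
= \sum_{i=1}^{p_3 k} \sum_{j=i}^{p_3 k} \frac{\theta_i}{l_i - i + 1}
= \sum_{i=1}^{p_3 k} \frac{(p_3 k - i + 1)\,\theta_i}{l_i - i + 1} \\
&= \sum_{i=1}^{p_3 k} \theta_i + \sum_{i=1}^{p_1 k} \frac{(p_3 - p_2) k\, \theta_i}{p_2 k - i + 1}
+ \sum_{i = p_1 k + 1}^{p_3 k} \frac{(p_3 - 1) k\, \theta_i}{k - i + 1} \\
&= \sum_{i=1}^{p_3 k} \theta_i + (p_3 - p_2) k\, \lambda_{p_1 k} + (p_3 - 1) k\, (\lambda_{p_3 k} - \lambda_{p_1 k}),
\end{aligned}
$$

so that

$$\sum_{i=1}^{p_3 k} \theta_i = \frac{k}{3} \Bigl( (\gamma + 2)\, p_3 + (p_2 - 1)\, \lambda_{p_1 k} + (1 - p_3)\, \lambda_{p_3 k} \Bigr),$$

and also

$$
\begin{aligned}
\sum_{i = p_3 k + 1}^{p_2 k} \theta_i &= (3 - 2\theta_{p_3 k})(1 - p_3)\, k \sum_{i = (1 - p_2) k}^{(1 - p_3) k - 1} \frac{1}{i} \\
&= (3 - 2\theta_{p_3 k})(1 - p_3)\, k \Bigl( \ln \frac{1 - p_3}{1 - p_2} + o(1) \Bigr).
\end{aligned}
$$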


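From these two equations, we can write Condition 12 as follows:

$$1 - \frac{\gamma + 2}{3}\, p_3 + \frac{1 - p_2}{3}\, \lambda_{p_1 k} - \frac{1 - p_3}{3}\, \lambda_{p_3 k}
- (3 - 2\theta_{p_3 k})(1 - p_3) \ln \frac{1 - p_3}{1 - p_2} \le 0. \tag{23}$$

In order to write Condition 16 we need the following:

$$\sum_{j=1}^{p_4 k} \lambda_j = (\gamma + 2)\, p_4 k - 2 \sum_{j=1}^{p_4 k} \theta_j \qquad \text{and}$$

$$
\begin{aligned}
\sum_{j=1}^{p_4 k} \lambda_j &= \sum_{j=1}^{p_4 k} \sum_{i=1}^{j} \frac{\theta_i}{p_2 k - i + 1}
= \sum_{i=1}^{p_4 k} \sum_{j=i}^{p_4 k} \frac{\theta_i}{p_2 k - i + 1}
= \sum_{i=1}^{p_4 k} \frac{(p_4 k - i + 1)\,\theta_i}{p_2 k - i + 1} \\
&= \sum_{i=1}^{p_4 k} \theta_i + \sum_{i=1}^{p_4 k} \frac{(p_4 - p_2) k\, \theta_i}{p_2 k - i + 1}
= \sum_{i=1}^{p_4 k} \theta_i + (p_4 - p_2) k\, \lambda_{p_4 k}.
\end{aligned}
$$

Therefore, Condition 16 can be written as

$$(p_2 - p_3) - (3 - 2\theta_{p_3 k})(1 - p_3) \ln \frac{1 - p_3}{1 - p_2} + o(1)
\le \frac{\gamma + 2}{3}\, p_4 + \frac{p_2 - p_4}{3}\, \lambda_{p_4 k} - p_4. \tag{24}$$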

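For $j \le p_4 k$, Condition 18 can be written in terms of the $p_i$'s and $\gamma$ as follows:

$$\sum_{i = j+1}^{p_3 k} \frac{\theta_i}{l_i - i + 1} = \lambda_{p_3 k} - \lambda_j \ge \theta_j - 1
\ \Longleftrightarrow\ \theta_j \ge 2\theta_{p_3 k} - 1 \qquad \forall j \le p_4 k.$$

In the last inequality, we substituted $\lambda_j$ in terms of the $\theta_j$'s using Equation 20. Thus, it is
sufficient to have:

$$\theta_{p_4 k} \ge 2\theta_{p_3 k} - 1. \tag{25}$$

Furthermore, for $p_4 k < j \le p_3 k$,

$$\sum_{i = j+1}^{p_2 k} \frac{\theta_i}{l_i - i + 1} = \lambda_{p_2 k} - \lambda_j \ge \theta_j - 1
\ \Longleftrightarrow\ \theta_j + \theta_{p_2 k} \ge 2 \qquad \forall\, p_4 k < j \le p_3 k.$$

Again, in the last step we used Equation 20. Thus, it is sufficient to have:

$$\theta_{p_3 k} + \theta_{p_2 k} \ge 2. \tag{26}$$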
The last observation here is that

$$\theta_{p_2 k} < 1 \ \Longrightarrow\ \zeta = \sum_{i=1}^{k} \frac{\theta_i}{l_i - i + 1}
= \lambda_{p_2 k} = \gamma - 1 + \theta_{p_2 k} < \gamma.$$

Thus, from Inequality 22, we get $\zeta < \gamma$ and it is sufficient to minimize $\gamma$ instead of
$\max(\zeta, \gamma)$. Now, we need to find $p_4 \le p_1 < p_3 < p_2$ such that Inequalities 22, 23, 24, 25 and 26
are all satisfied and $\gamma$ is minimum. Notice that all these recent inequalities are in terms of the $p_i$'s
and $\gamma$ (because the $\theta_i$'s have been written in terms of them, as well as the $\lambda_i$'s). Now we
can observe that by setting $p_1 = 0.327$, $p_2 = 0.737$, $p_3 = 0.539$, and
$p_4 = 0.327$, all inequalities are satisfied and $\gamma < 1.939 < 1.94$ as desired.